Mathematical model of interest matchmaking in electronic social networks 
I. INTRODUCTION

Social activities in electronic networks play an increasingly
important role in our everyday lives. We
exchange important information via electronic mail,
wikis, web-based forums, or blogs, and meet new friends
or business contacts in Internet communities and social
network services. Parallel to this growing socialization of
the World Wide Web, the requirements on the electronic
services become more ambitious. Huge data quantities
have to be processed, user-friendly interfaces are to be
designed, and more and more sophisticated computations
must be implemented to offer complex solutions.

This paper studies a special aspect of social network 
services, the matchmaking problem. In essence it asks, 
given a search profile, for advertising profiles matching 
it best. This problem is in principle well-known in Grid 
computing, where computational tasks seek
appropriate resources such as CPU time and memory
space on different computers. In electronic social networks,
however, the problem is more general because not
only specified attribute ranges are to be compared but
also more or less vaguely describable interests.

The aim of this paper is to formulate a mathemati- 
cal model for the problem of matchmaking of attribute 
ranges and fields of interests in electronic social net- 
works. It tackles the following fundamental questions. 
How can an appropriate system and its data structure
be designed? How can a matching problem be formulated
mathematically as an optimization problem? In particular,
what is its search space, and what is its objective
function? Whereas the idea to use a fuzzy function to
calculate the matching degree of two numerical ranges
may suggest itself, what could a function calculating the
matching degree of two fields of interest look like? One
of the central results of this paper is the proposal of a precise
definition of such a function computing this matching
degree and the presentation of a concrete example.

The paper is organized as follows. After a definition 
of electronic social networks is given in the next section, 
a mathematical model of the matchmaking problem as
an optimization problem is proposed, especially the data 
structure of search and advertising profiles, the search 
space, and the matching degree as the objective function. 
A short discussion concludes the paper. 



II. ELECTRONIC SOCIAL NETWORKS 

A social network consists of a finite set of actors and 
the direct relations defined on them. An actor here may 
be an individual, a group, or an organization, and the 
direct relation between two actors indicates that they di- 
rectly interact with each other, have immediate contact, 
or are connected through social familiarities, such as casual
acquaintance or familial bonds [12]. Thus a social
network is naturally represented by a graph in which each 
node represents an actor and each edge a direct relation. 
Empirically, the mean number of direct relationships of 
an individual in a biological social network depends on 
the size of the neocortex of its individuals; the maximum 
size of such relationships in human social networks tends 
to be around 150 people ("Dunbar's number") and the 
average size around 124 people.

Since the popularization of the World Wide Web
in the middle of the 1990s, several Internet social
networks have emerged, maintained by social network
services such as "circles of friends" like Friendster
(www.friendster.com), MySpace (www.myspace.com), or
orkut (www.orkut.com), platforms for business professionals
like XING (www.xing.com), or virtual worlds like
Second Life (secondlife.com). Internet social networks
are instances of electronic social networks.

In this paper, an electronic social network is defined 
as a network of at least three human individuals or or- 
ganizations which use essentially, albeit not exclusively, 
electronic devices and media to get in contact and ac- 
quaintance, to meet new partners, to communicate, and 
to exchange information. Examples of electronic social 
networks are Internet social networks, as well as video- 
conference sessions and conference calls, especially if they 
serve to meet new people as in party lines, or as long 
as they admit spontaneous communication between each 
member of the network. 

III. THE MATCHMAKING PROBLEM



1. Definitions 



In computer science, the term matching in general
refers to the process of evaluating the degree of similarity
or of agreement of two objects. Each object is characterized
by a set of properties or attributes, which in many
systems are given by name-value pairs. Matching
plays a vital role in many areas of computer science and
communication systems. For instance, it is studied for resource
discovery and resource allocation in grid computing,
where matchmaking services are needed to intermediate
between resource requesters and resource providers.
Other examples are given by the problem of matching
demand and supply of business or personal profiles
in online auctions, e-commerce, recruitment agencies, or
dating services.



A. Profiles 

In most matching problems, the objects under consideration
take asymmetric roles, viz., some try to search
for information or request a service, others try to advertise
information or provide a service. A single object
may naturally do both at the same time; in electronic
social networks this is even the usual case. In the sequel
we will therefore more accurately consider the matching
of a search profile, containing the requested information,
and an advertising profile, presenting the provided information.

Given a specific search profile, the matchmaking problem
then is to find those advertising profiles which match
it best, in a sense to be specified in the sequel. Generalizations
of this problem ask for best global matchings,
given a whole set of search profiles and a set of advertising
profiles. For instance, the global pairwise matchmaking
problem seeks pairs of search and advertising profiles
such that the entirety of the pairs matches best under
the constraint that any profile is a member of at most one
pair, whereas the global multiple matchmaking problem searches
for possibly multiple combinations of search and advertising
profiles which as a whole match best. The
pairwise version of the problem typically occurs for dating
services or classical marriage matchmaking tasks,
whereas the multiple version appears in grid computing
or in brokering interest groups.

In this paper we will focus on the local version of the 
matchmaking problem, i.e., finding an optimum advertis- 
ing profile to a specified search profile. Thus the match- 
making problem is an optimization problem, and to for- 
mulate it precisely we have to specify the search space 
and the objective function. The search space will turn 
out to be the set of pairs of the fixed search profile and 
the advertising profiles, and the objective function will 
be a function measuring the "matching degree." We will 
work out these notions in the next sections. 



A profile consists of its owner, being an actor of the
electronic social network, a list of attributes of a given
set $A$ together with their values, a list of attribute stencils
where each stencil represents a pair of an attribute
name and its value range, and a list of fields of interest
specifying their respective levels of interest. Attributes
are properties of the profile owner such as age, height,
weight, eye color, or hair color, and we therefore subsume
them under the class "Owner" (Fig. 1). In principle,
there are two different types of attributes, subsumed
in the two disjoint sets $N$ and $D$ such that the set $A$ of
attributes separates as

$$A = N \cup D. \tag{1}$$

[Figure 1: UML class diagram with the classes Owner (id: String, height: integer, eye color: String), Profile, Interest (field: String, level: [-1,1]), and Stencil (attribute: String, range: $\mathcal{T}$).]

FIG. 1: UML diagram of the data structure of a profile and its
relationship to the owner's attributes, the attribute stencils and
the fields of interest. An attribute stencil consists of an owner's
attribute name and its (searched or advertised) range.



The set $N$ consists of the numerical attributes of the
owner which take integer or real numbers as values; the
set $D$ consists of the attributes taking discrete non-numerical values. (The
difference between numerical and non-numerical is not
sharp, for instance a string could well be considered as
numerical via a symbol code, as well as non-numerical
since it seldom makes sense to multiply or divide strings;
in most cases, strings are better considered as
non-numerical.)

Correspondingly, the stencil of an attribute is determined
by the attribute's name and its range, being of a
certain set called Type $\mathcal{T}$,

$$\mathcal{T} = \mathcal{N} \cup \mathcal{D}, \tag{2}$$

where $\mathcal{N}$ denotes the set of ranges for the numerical
attributes,

$$\mathcal{N} \subset \{[a, b] : -\infty \le a, b \le \infty\}, \tag{3}$$

i.e., $\mathcal{N}$ is a set of closed intervals $[a,b] \subset \mathbb{R}$, and $\mathcal{D}$
denotes the set of ranges for the discrete non-numerical
attributes,

$$\mathcal{D} \subset \{E : E \text{ is a finite set or enum}\}, \tag{4}$$

i.e., each element of $\mathcal{D}$ is a finite set or enum, specified by the respective
owner attributes determined by the system model.
We allow the empty set as null element in $\mathcal{N}$ and $\mathcal{D}$.

If a given range $R \in \mathcal{T}$ contains only one element, say
$R = \{x\}$, then the stencil is often shortly written "$p = x$"
instead of "$p \in R$." If, on the other hand, $R = [x, \infty]$
then we may write "$p \ge x$" instead of "$p \in R$." For instance,
"height $= 180$" means "height $\in [180, 180]$," and
"height $\ge 180$" means "height $\in [180, \infty]$."

On the other hand, a field of interest is a name-value
pair specifying the field itself as well as its level, ranging
on a scale from $-1$ to $1$ and coded by interpolation of
the following table,

$$\begin{array}{r|l}
\text{Level} & \text{Meaning} \\ \hline
-1 & \text{aversion} \\
0 & \text{indifference} \\
1 & \text{enthusiasm}
\end{array} \tag{5}$$



The set of fields of interest is denoted by $I$ and is a
subset of the words over a specified alphabet $\Sigma$,

$$I \subset \Sigma^*. \tag{6}$$

Usually, $\Sigma$ is the set of ASCII or Unicode symbols. The
set $I$ determines the set of all fields of interest available
to the system. Depending on the system design, it may
be a fixed set of words, or it may admit arbitrary words over the
alphabet $\Sigma$.



B. The search space 

Given a set $\mathfrak{S}$ of search profiles $s$ and a set $\mathfrak{A}$ of advertising
profiles $a$ as input, the search space $S$ of a global
matchmaking problem is given by all pairs of search and
advertising profiles, i.e., $S = \mathfrak{S} \times \mathfrak{A}$. In this paper, however,
we are considering the local matchmaking problem,
given a single search profile $s$, i.e., $\mathfrak{S} = \{s\}$, and the
search space

$$S = \{s\} \times \mathfrak{A}(s), \tag{7}$$

where $\mathfrak{A}(s) = \{a \in \mathfrak{A} : \mathrm{owner}(a) \ne \mathrm{owner}(s)\}$. A search
profile $s$ itself is a set of the given attribute stencils $n_s$,
$d_s$, and of fields of interest $i_s$,

$$s = n_s \cup d_s \cup i_s, \tag{8}$$

where

$$n_s = \{(p, R_s(p)) : p \in N_s\}$$

is the set of attribute-range pairs, with the given mapping
$R_s : N_s \to \mathcal{N}$ from the set $N_s$ of the searched numerical
attributes to their associated desired ranges ($R_s$
associates to each numerical attribute $p$ in $N_s$ an interval $R_s(p)$),

$$d_s = \{(p, E_s(p)) : p \in D_s\}$$

is the set of attribute-set pairs, with the mapping
$E_s : D_s \to \mathcal{D}$ from the given set $D_s$ of searched discrete attributes
to their desired sets or enums ($E_s$ associates to each
discrete non-numerical attribute $p$ a set $E_s(p)$), and

$$i_s = \{(p, l_s(p)) : p \in I_s\}$$

is the set of searched fields of interest with their desired
levels, with the given mapping $l_s : I_s \to [-1, 1]$. Note
that each of the mappings $R_s$, $E_s$, $l_s$ can be easily
implemented as a table or a hash map. Analogously, an
advertising profile is given by

$$a = n_a \cup d_a \cup i_a \tag{9}$$

where the three sets are defined the same way as in the
search case, with the index 's' (for "search") replaced by
'a' (for "advertising").
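
The remark that the three mappings can be held in hash maps can be made concrete with a minimal Python sketch. The class name `Profile`, its field names, and the validation rules are illustrative assumptions and not part of the model itself; one dictionary is used per mapping ($R$, $E$, and $l$).

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Tuple


@dataclass
class Profile:
    """A search or advertising profile (hypothetical layout).

    numeric   : attribute name -> closed interval (a, b), i.e. the mapping R
    discrete  : attribute name -> finite set of admissible values, i.e. E
    interests : field of interest -> level in [-1, 1], i.e. the mapping l
    """
    owner: str
    numeric: Dict[str, Tuple[float, float]] = field(default_factory=dict)
    discrete: Dict[str, FrozenSet[str]] = field(default_factory=dict)
    interests: Dict[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Intervals must satisfy a <= b, levels must lie on the scale [-1, 1].
        for name, (a, b) in self.numeric.items():
            if a > b:
                raise ValueError(f"empty interval for attribute {name!r}")
        for name, level in self.interests.items():
            if not -1.0 <= level <= 1.0:
                raise ValueError(f"level of {name!r} outside [-1, 1]")


# A search profile with one numerical stencil and two fields of interest.
example_search = Profile(
    owner="Alice",
    numeric={"age": (20, 40)},
    interests={"tennis": 1.0, "chess": 0.5},
)
```

Keeping the three dictionaries separate mirrors the decomposition of a profile into its numerical stencils, discrete stencils, and fields of interest, so that each part can later be matched by its own partial matching function.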

Example 1. In grid computing, a main matching problem
is resource discovery and resource allocation.
Assume a toy grid consisting of two resource providers,
Haegar and Bond, and two resource requests by some
computational process. In our terminology, Bond and
Haegar each offer an advertising profile, whereas the
requests are represented by search profiles. Moreover,
in the widely used matchmaking framework Condor-G
[10], the profiles are called ClassAds (classified
advertisements). Let us assume the profiles according to
the following tables.

Search Profiles

    owner = 194.94.2.21   | owner = 194.1.1.3
    ----------------------+-------------------
    CPU > 1.6 GHz         | memory > 2 GB
    memory > 1 GB         |

Advertising Profiles

    owner = bond.cs.ucf.edu | owner = 194.94.2.20
    ------------------------+---------------------
    CPU < 3.6 GHz           | CPU = 2.5 GHz
    memory < 4.0 GB         | memory = 1.0 GB

In each column of a profile there is listed its owner and
some attributes and their values. □

Example 2. Assume a small social network for pooling 
interest groups, consisting of three persons, Alice, Bob, 
and Carl, who provide search and advertising profiles ac- 
cording to the following tables. 



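Search Profiles

    owner = Alice     | owner = Carl
    ------------------+-------------------
    age ∈ [20,40]     | age ∈ [20,30]
                      | height ≥ 180
    ------------------+-------------------
    tennis = 1.0      | basketball = 1.0
    chess = 0.5       |

Advertising Profiles

    owner = Alice     | owner = Bob        | owner = Carl
    ------------------+--------------------+------------------
    age = 25          | age = 26           | age = 31
    height = 165      | height = 182       | height = 195
    ------------------+--------------------+------------------
    tennis = 1.0      | tennis = 0.5       | basketball = 1.0
    chess = 0.5       | basketball = -1.0  |
    basketball = 0.5  |                    |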

In each column of a profile there is listed its owner, some
attributes and their values, and the fields of interest
with their levels. For instance, Alice looks for someone
between 20 and 40 years of age who is enthusiastic about tennis
and has some penchant for chess, whereas Carl
seeks a tall person in their twenties with highest preference for
basketball. Looking at the advertising profiles in this social
network, one sees that Alice may contact Bob, but
Carl cannot find an ideal partner in this community. On
the other hand, Alice would be a "better" partner for
Carl than Bob, since she is partly interested in basketball.
Formally, Alice's search profile, for instance, is given
as follows. The sets for the searched attributes and fields
of interest are

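$$N_s = \{\text{age}\}, \quad D_s = \emptyset, \quad I_s = \{\text{tennis}, \text{chess}\}, \tag{10}$$

the mapping $R_s$ is given by

$$R_s(p) = \begin{cases} [20, 40] & \text{if } p = \text{"age,"} \\ \emptyset & \text{otherwise,} \end{cases} \tag{11}$$

and the mapping $l_s$ is given by the table

$$\begin{array}{c|cc}
p & \text{tennis} & \text{chess} \\ \hline
l_s(p) & 1.0 & 0.5
\end{array} \tag{12}$$

The mapping $E_s$ does not exist since $D_s = \emptyset$. To sum
up, Alice's search profile is given by

$$s_{\text{Alice}} = \{(\text{age}, [20,40])\} \cup \{(\text{tennis}, 1.), (\text{chess}, .5)\}. \tag{13}$$

Note in particular that $d_{s_{\text{Alice}}} = \emptyset$. On the other hand,
the advertising profiles read

$$\mathfrak{A}(s_{\text{Alice}}) = \{a_{\text{Bob}}, a_{\text{Carl}}\} \tag{14}$$

where

$$a_{\text{Bob}} = \{(\text{age}, [26, 26]), (\text{height}, [182, 182])\} \cup \{(\text{tennis}, .5), (\text{basketball}, -1.)\}, \tag{15}$$

$$a_{\text{Carl}} = \{(\text{age}, [31, 31]), (\text{height}, [195, 195])\} \cup \{(\text{basketball}, 1.)\}. \tag{16}$$

With the definitions

$$\mathbf{s}_B = (s_{\text{Alice}}, a_{\text{Bob}}), \quad \mathbf{s}_C = (s_{\text{Alice}}, a_{\text{Carl}}), \tag{17}$$

the search space $S = \{s_{\text{Alice}}\} \times \{a_{\text{Bob}}, a_{\text{Carl}}\} = \{\mathbf{s}_B, \mathbf{s}_C\}$
consists of two feasible solutions. □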
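
The construction of the local search space for this example can be sketched in a few lines of Python. The dictionary layout and the function name `search_space` are assumptions made for illustration; the only rule taken from the text is that advertising profiles owned by the searcher are excluded.

```python
def search_space(search, adverts):
    """Return the pairs (s, a) for all advertising profiles a
    whose owner differs from the owner of the search profile s."""
    return [(search, a) for a in adverts if a["owner"] != search["owner"]]


# Profiles of the three-person example, stored as plain dictionaries.
s_alice = {"owner": "Alice",
           "numeric": {"age": (20, 40)},
           "interests": {"tennis": 1.0, "chess": 0.5}}

adverts = [
    {"owner": "Alice",
     "numeric": {"age": (25, 25), "height": (165, 165)},
     "interests": {"tennis": 1.0, "chess": 0.5, "basketball": 0.5}},
    {"owner": "Bob",
     "numeric": {"age": (26, 26), "height": (182, 182)},
     "interests": {"tennis": 0.5, "basketball": -1.0}},
    {"owner": "Carl",
     "numeric": {"age": (31, 31), "height": (195, 195)},
     "interests": {"basketball": 1.0}},
]

space = search_space(s_alice, adverts)
print([a["owner"] for _, a in space])  # ['Bob', 'Carl']
```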



C. A matching degree function 

The matching degree of a search profile and an advertising
profile is a real number $f$, typically $0 \le f \le 1$,
with $f = 0$ meaning "total mismatch" and $f = 1$ meaning
"perfect match." In general, the matching degree will
be the weighted sum of several partial matching degrees,
one for each property separately. Moreover, the matching
degree of an attribute is calculated in a different way
than the matching degree of a field of interest. Proposals
for these different kinds of matchings are introduced in
the following paragraphs.



1. Matching degree of an attribute range 

A function measuring the matching degree of ranges
of an attribute has to quantify how a given advertised
attribute stencil $[a_a, b_a]$ fits into the stencil pattern given
by the corresponding range in the search profile. In case
of a numerical attribute, the stencil is given by a closed
interval $[a, b] \in \mathcal{T}$; in case of a discrete-value attribute it
is a set or enum $E$.

a. Numerical attribute intervals matching. To determine
the matching degree of a searched value range
$[a_s, b_s]$ with a given advertised value $x \in \mathbb{R}$, we
define the fuzzy step function $h_\varepsilon(x) = h_\varepsilon(x; a, b)$ with
$a \le b$ and $0 < \varepsilon < 1$, as

$$h_\varepsilon(x; a, b) = \begin{cases}
\dfrac{x - (1-\varepsilon)a}{a\varepsilon} & \text{if } (1-\varepsilon)a < x \le a, \\[1ex]
1 & \text{if } a < x \le b, \\[1ex]
\dfrac{(1+\varepsilon)b - x}{b\varepsilon} & \text{if } b < x \le (1+\varepsilon)b, \\[1ex]
0 & \text{otherwise}
\end{cases} \tag{18}$$

(Fig. 2). The parameter $\varepsilon$ is called the fuzzy level. It
denotes the relative length of the fuzzy transition region.
The smaller $\varepsilon$, the narrower this region, and the more
accurately an advertised attribute value must fit into the
searched interval. In the limit $\varepsilon \to 0$, the function $h_\varepsilon$ is
the step function, and for $a \to -\infty$ or $b \to \infty$, it tends
to one of the Heaviside step functions $H_b(-x)$ or $H_a(x)$,
respectively.

FIG. 2: The fuzzy step function $h_\varepsilon(x)$ of Eq. (18).

If, for instance, the searched attribute is "height $\ge$
180" and an advertised attribute is "height = 165" then
for a fuzzy level of $\varepsilon = 10\%$, we have

$$h_{0.1}(165; 180, \infty) = \frac{165 - 0.9 \cdot 180}{180 \cdot 0.1} = \frac{3}{18} \approx 0.167, \tag{19}$$

i.e., the matching degree is 16.7%. Then the matching
degree of two numerical ranges, $[a_s, b_s]$ as search range
and $[a_a, b_a]$ as advertised range, is given by

$$m_n([a_s, b_s], [a_a, b_a]; \varepsilon) = \max[h_\varepsilon(b_a; a_s, b_s),\; h_\varepsilon(b_s; a_a, b_a)]. \tag{20}$$
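
As a rough check of the worked value of 16.7%, the fuzzy step function and the range matching degree can be sketched in Python. The sketch assumes linear ramps of relative width $\varepsilon$ on both sides of the searched interval, which is the form that reproduces the worked value; the function names `fuzzy_step` and `m_n` are illustrative, and unbounded ranges are represented by `math.inf`.

```python
import math


def fuzzy_step(x, a, b, eps):
    """Fuzzy step function h_eps(x; a, b) with a <= b and 0 < eps < 1.

    Assumption: linear ramps on ((1-eps)a, a] and (b, (1+eps)b].
    For a <= 0 the left ramp interval is empty, as in the formula as stated.
    """
    if not (a <= b and 0 < eps < 1):
        raise ValueError("need a <= b and 0 < eps < 1")
    if (1 - eps) * a < x <= a:          # rising ramp below the interval
        return (x - (1 - eps) * a) / (a * eps)
    if a < x <= b:                      # inside the searched interval
        return 1.0
    if b < x <= (1 + eps) * b:          # falling ramp above the interval
        return ((1 + eps) * b - x) / (b * eps)
    return 0.0


def m_n(search, advertised, eps):
    """Matching degree of a searched range and an advertised range."""
    a_s, b_s = search
    a_a, b_a = advertised
    return max(fuzzy_step(b_a, a_s, b_s, eps),
               fuzzy_step(b_s, a_a, b_a, eps))


# Searched "height >= 180", advertised "height = 165", fuzzy level 10 %.
degree = m_n((180, math.inf), (165, 165), 0.1)
print(round(degree, 3))  # 0.167
```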

b. Features matching finite sets or enums. If the values
of a specific attribute are constrained to be of a finite
set, or an enum, say $E$, then the matching degree is determined
by the Boolean characteristic function $\chi_E$ defined
by

$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{otherwise.} \end{cases} \tag{21}$$

If the searched attribute, for instance, is "name $\in$
{'Smith', 'Taylor'}" and the advertised attribute is
"name = 'Tailor'" then $E$ = {'Smith', 'Taylor'} and
$\chi_E(\text{'Tailor'}) = 0$, i.e., the matching degree is zero. Since
the owner of an advertising profile can advertise at most
one value for an attribute, we have

$$m_d(E, \{x\}) = \chi_E(x). \tag{22}$$



2. Matching degree of a field of interest



First we notice that the matching degree as a function
of the levels of interest $l_s$ for the search profile and $l_a$ for
the advertising profile must be asymmetric. For instance,
if $l_s = 0$ and $l_a = 1$, i.e., the search is indifferent with
respect to the field of interest, and the advertising profile
has $l_a = 1$, then the matching degree should be greater
than 0, but if the search requires $l_s = 1$ and $l_a = 0$ then
the matching degree should be zero. In the first case, the
searcher is indifferent about the field of interest, in the
second case he demands high interest.

Definition 3. An interest matching degree function is
a function $m : [-1, 1]^2 \to [0, 1]$ such that the following
conditions are satisfied.

$$\begin{array}{c|ccc}
(l_s, l_a) & (x, x) & (0, \pm 1) & (\pm 1, 0) \\ \hline
m(l_s, l_a) & 1 & \tfrac{1}{2} & 0
\end{array} \qquad (\forall x \in [-1, 1]) \tag{23}$$

The first condition expresses the perfect matching of the
diagonal, the second the search indifference, and the last
the search necessity. □

A possible matching degree function is given by

$$m_i(l_s, l_a) = \max[\varphi(l_s, l_a), 0] \tag{24}$$

where

$$\varphi(x, y) = 1 - \frac{(c^2 - 1)(x - y)^2}{c^2 - 2 + (x - cy)^2} \tag{25}$$

with

$$c = \frac{1 + \sqrt{7}}{2} \approx 1.823. \tag{26}$$

By construction, $m_i(l_s, l_a)$ satisfies the conditions in (23)
and therefore is an interest matching degree function. It
is asymmetric with respect to its arguments, since we
have $m_i(l_s, l_a) \ne m_i(l_a, l_s)$ if and only if $l_s^2 \ne l_a^2$. On
the other hand, it is an even function, i.e., $m_i(l_s, l_a) =
m_i(-l_s, -l_a)$.

FIG. 3: The matching degree function $m_i = m_i(l_s, l_a)$ in Eq. (24).
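
The conditions of Definition 3 can be checked numerically with a short Python sketch. It assumes that $\varphi$ has the denominator $c^2 - 2 + (x - cy)^2$, which is the form that reproduces both the conditions (23) and the partial degrees .5636 and .6308 used in the revisited example; the names `phi` and `m_i` are illustrative.

```python
import math

C = (1 + math.sqrt(7)) / 2  # the constant c, approximately 1.823


def phi(x, y):
    """phi(x, y) with the assumed denominator c^2 - 2 + (x - c*y)^2."""
    return 1 - (C**2 - 1) * (x - y) ** 2 / (C**2 - 2 + (x - C * y) ** 2)


def m_i(l_s, l_a):
    """Interest matching degree of a searched level l_s and an advertised level l_a."""
    return max(phi(l_s, l_a), 0.0)


# Conditions of Definition 3: diagonal, search indifference, search necessity.
print(m_i(0.3, 0.3))           # 1.0
print(round(m_i(0, 1), 4))     # 0.5
print(round(m_i(1, 0), 4))     # 0.0

# Asymmetry: swapping the arguments changes the value.
print(round(m_i(1, 0.5), 4))   # 0.5636
print(round(m_i(0.5, 0), 4))   # 0.6308
```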



3. The total matching degree function 

Putting together all partial matching degrees considered
above, we have to construct a function $f : S \to [0, 1]$
as a weighted sum of them. We notice that any $\mathbf{s} \in S$ represents
a feasible solution of the matchmaking problem
and has the form $\mathbf{s} = (s, a)$ where $s$ is the given search
profile (8) and $a$ is one of the given advertising profiles
(9) in the network. Then $f$ is defined for each $\mathbf{s} \in S$ by

$$f(\mathbf{s}) = \sum_{p \in N_s} \frac{m_n(R_s(p), R_a(p))}{|N_s| + |D_s| + |I_s|}
+ \sum_{p \in D_s} \frac{m_d(E_s(p), E_a(p))}{|N_s| + |D_s| + |I_s|}
+ \sum_{p \in I_s} \frac{m_i(l_s(p), l_a(p))}{|N_s| + |D_s| + |I_s|} \tag{27}$$

where $R_a(p)$ and $E_a(p)$ denote the advertised attribute ranges of the
attribute $p$, $l_a(p)$ is the advertised interest level of the
field of interest $p$, and the vertical bars $|\cdot|$ embracing a
set denote the number of its elements.

Thus for the computation of the matching degree, the
attributes and fields of interest of the search profile
$s$ are leading, i.e., it is $s$ which determines what is
tried to be matched. Thus, if an attribute $p$ of the search
profile does not occur in the advertising profile, then the
matching degree functions $m_n$ and $m_d$ vanish for $p$ by
definition. If, however, a searched field of interest $p \in I_s$
does not occur in the advertised profile, then it is the level
$l_a(p)$ which vanishes by definition. Note the crucial difference
between null values of attributes and null values
of fields of interest in the advertising profile: searched attributes
are mandatory, and at least with respect to this
attribute there is a complete mismatch; if a field of interest,
however, does not occur in the advertised profile,
it is indifferent to its owner, but depending on the level
of interest in the search profile, the matching degree may
be positive nonetheless.
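
The equally weighted total matching degree, together with the null-value conventions just described, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the dictionary layout, the function names, the default fuzzy level of 0.1, and the assumed forms of the fuzzy step function (linear ramps) and of $\varphi$ (denominator $c^2 - 2 + (x - cy)^2$) are illustrative choices. The data are the profiles of Alice, Bob, and Carl from Example 2.

```python
import math

C = (1 + math.sqrt(7)) / 2


def fuzzy_step(x, a, b, eps):
    # Assumed linear ramps of relative width eps around [a, b].
    if (1 - eps) * a < x <= a:
        return (x - (1 - eps) * a) / (a * eps)
    if a < x <= b:
        return 1.0
    if b < x <= (1 + eps) * b:
        return ((1 + eps) * b - x) / (b * eps)
    return 0.0


def m_n(search, adv, eps):
    return max(fuzzy_step(adv[1], search[0], search[1], eps),
               fuzzy_step(search[1], adv[0], adv[1], eps))


def m_i(l_s, l_a):
    phi = 1 - (C**2 - 1) * (l_s - l_a) ** 2 / (C**2 - 2 + (l_s - C * l_a) ** 2)
    return max(phi, 0.0)


def total_degree(s, a, eps=0.1):
    """Equally weighted matching degree of search profile s and advertising profile a."""
    numeric = s.get("numeric", {})
    discrete = s.get("discrete", {})
    interests = s.get("interests", {})
    count = len(numeric) + len(discrete) + len(interests)
    if count == 0:
        raise ValueError("search profile has no searched items")
    total = 0.0
    for p, rng in numeric.items():
        if p in a.get("numeric", {}):       # missing attribute: contributes 0
            total += m_n(rng, a["numeric"][p], eps)
    for p, allowed in discrete.items():
        if p in a.get("discrete", {}):      # missing attribute: contributes 0
            total += 1.0 if a["discrete"][p] in allowed else 0.0
    for p, level in interests.items():      # missing field: advertised level 0
        total += m_i(level, a.get("interests", {}).get(p, 0.0))
    return total / count


s_alice = {"owner": "Alice", "numeric": {"age": (20, 40)},
           "interests": {"tennis": 1.0, "chess": 0.5}}
a_bob = {"owner": "Bob", "numeric": {"age": (26, 26), "height": (182, 182)},
         "interests": {"tennis": 0.5, "basketball": -1.0}}
a_carl = {"owner": "Carl", "numeric": {"age": (31, 31), "height": (195, 195)},
          "interests": {"basketball": 1.0}}

f_bob = total_degree(s_alice, a_bob)
f_carl = total_degree(s_alice, a_carl)
best = max([a_bob, a_carl], key=lambda a: total_degree(s_alice, a))
print(round(f_bob, 4), round(f_carl, 4), best["owner"])  # 0.7315 0.5436 Bob
```

Solving the local matchmaking problem then amounts to taking the maximum of this function over the search space, as in the last lines of the sketch.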

Example 2 (revisited). For Alice's search space we
have the two solutions (17), i.e.,

$$f(\mathbf{s}_B) = \frac{m_n([20, 40], [26, 26])}{3} + \frac{m_i(1, .5) + m_i(.5, 0)}{3}
= \frac{1}{3} + \frac{.5636 + .6308}{3} = 0.7315 \tag{28}$$

and

$$f(\mathbf{s}_C) = \frac{m_n([20, 40], [31, 31])}{3} + \frac{m_i(1., 0) + m_i(.5, 0)}{3}
= \frac{1}{3} + \frac{0 + .6308}{3} = 0.5436. \tag{29}$$

Hence Bob's advertising profile has a matching degree
of 73.15% with Alice's search profile, whereas Carl's
matches it only by 54.36%. □

Notice that the objective function (27) is constructed
in such a way that each searched item $p$ of a search profile
has equal weight. If, however, each item should have
its own weight $w(p)$, then the objective function (27) is
easily modified to

$$f(\mathbf{s}) = \sum_{p \in N_s} \frac{w_n(p)\, m_n(R_s(p), R_a(p))}{w_{\mathrm{tot}}}
+ \sum_{p \in D_s} \frac{w_d(p)\, m_d(E_s(p), E_a(p))}{w_{\mathrm{tot}}}
+ \sum_{p \in I_s} \frac{w_i(p)\, m_i(l_s(p), l_a(p))}{w_{\mathrm{tot}}} \tag{30}$$

where $w_{\mathrm{tot}}$ is the total sum of all weights $w_n$ on $N_s$, $w_d$ on $D_s$, and $w_i$ on $I_s$,

$$w_{\mathrm{tot}} = \sum_{p \in N_s} w_n(p) + \sum_{p \in D_s} w_d(p) + \sum_{p \in I_s} w_i(p).$$



IV. DISCUSSION 

In this paper, a mathematical model of the matchmaking
of search and advertising profiles in an electronic
social network is proposed. Based on the data structure
described by Figure 1 and distinguishing between matchmaking
of attribute ranges via stencils and matchmaking
of fields of interest via comparison, the matchmaking
problem is formulated as an optimization problem, with
the search space consisting of a fixed search profile and
several advertising profiles as in Eq. (7) and the matching
degree as its objective function in Eq. (27). The main
difficulty is to define a function that adequately measures the
matching degree of two fields of interest and obeys the
necessary conditions listed in Definition 3. A proposed
solution is the function given in Eq. (24) and depicted in
Figure 3. The implementation of a matchmaking service
in an electronic social network based on this matching
optimization is straightforward.